Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 9702 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 1.2 MiB |
| Average record size in memory | 128.0 B |
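The memory figures above are consistent with every column being stored as an 8-byte value. That width is an assumption (the report does not list dtypes), and the names below are illustrative; the sketch only checks the arithmetic.

```python
# Assumption: all 16 columns use 8-byte dtypes (e.g. int64 / float64).
N_ROWS = 9702
N_COLS = 16
BYTES_PER_VALUE = 8  # hypothetical width, not stated in the report

record_bytes = N_COLS * BYTES_PER_VALUE        # bytes per row
total_mib = N_ROWS * record_bytes / 2**20      # whole frame, in MiB
column_kib = N_ROWS * BYTES_PER_VALUE / 2**10  # a single column, in KiB

print(f"record: {record_bytes} B")
print(f"total:  {total_mib:.1f} MiB")
print(f"column: {column_kib:.1f} KiB")
```

The per-variable "Memory size" of 75.9 KiB reported for each column is slightly above the value this gives, which may reflect a small fixed overhead such as the index.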
Variable types
| Numeric | 9 |
|---|---|
| Categorical | 7 |
Unnamed: 0 is highly correlated with ID | High correlation |
ID is highly correlated with Unnamed: 0 | High correlation |
NAME_FAMILY_STATUS is highly correlated with CNT_FAM_MEMBERS | High correlation |
CNT_FAM_MEMBERS is highly correlated with NAME_FAMILY_STATUS | High correlation |
CODE_GENDER is highly correlated with FLAG_OWN_CAR and 1 other fields | High correlation |
FLAG_OWN_CAR is highly correlated with CODE_GENDER | High correlation |
NAME_INCOME_TYPE is highly correlated with OCCUPATION_TYPE and 1 other fields | High correlation |
NAME_FAMILY_STATUS is highly correlated with CNT_FAM_MEMBERS | High correlation |
OCCUPATION_TYPE is highly correlated with CODE_GENDER and 1 other fields | High correlation |
CNT_FAM_MEMBERS is highly correlated with NAME_FAMILY_STATUS | High correlation |
AGE is highly correlated with NAME_INCOME_TYPE | High correlation |
Unnamed: 0 is uniformly distributed | Uniform |
Unnamed: 0 has unique values | Unique |
ID has unique values | Unique |
OCCUPATION_TYPE has 300 (3.1%) zeros | Zeros |
YEARS_EMPLOYED has 1694 (17.5%) zeros | Zeros |
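The percentages in the two "Zeros" alerts are the zero count divided by the number of observations (9702). A minimal sketch that reproduces the formatting, using a hypothetical helper name:

```python
def zero_share(zero_count: int, n_rows: int) -> str:
    """Format a zero count the way the alert lines do, e.g. '300 (3.1%)'."""
    return f"{zero_count} ({zero_count / n_rows:.1%})"

N_ROWS = 9702  # number of observations from the dataset statistics

print("OCCUPATION_TYPE:", zero_share(300, N_ROWS))
print("YEARS_EMPLOYED:", zero_share(1694, N_ROWS))
```

OCCUPATION_TYPE appears to be an integer-coded category, so its zeros are probably a category code rather than a quantity; that reading is an inference from the value range, not something the report states.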
Reproduction
| Analysis started | 2022-05-10 22:14:30.277308 |
|---|---|
| Analysis finished | 2022-05-10 22:15:06.760447 |
| Duration | 36.48 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
Unnamed: 0
Real number (ℝ≥0)
HIGH CORRELATION, UNIFORM, UNIQUE
| Distinct | 9702 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4850.5 |
| Minimum | 0 |
|---|---|
| Maximum | 9701 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 75.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 485.05 |
| Q1 | 2425.25 |
| median | 4850.5 |
| Q3 | 7275.75 |
| 95-th percentile | 9215.95 |
| Maximum | 9701 |
| Range | 9701 |
| Interquartile range (IQR) | 4850.5 |
Descriptive statistics
| Standard deviation | 2800.87049 |
|---|---|
| Coefficient of variation (CV) | 0.5774395402 |
| Kurtosis | -1.2 |
| Mean | 4850.5 |
| Median Absolute Deviation (MAD) | 2425.5 |
| Skewness | 0 |
| Sum | 47059551 |
| Variance | 7844875.5 |
| Monotonicity | Strictly increasing |
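The quantile and descriptive statistics above can be reproduced by hand for this column. The sketch below assumes the column holds exactly the integers 0 to 9701 (consistent with 9702 distinct values, a range of 9701, and strict monotonicity), uses the sample variance with an n − 1 denominator, and uses linearly interpolated quantiles; the helper name `quantile` is illustrative.

```python
import statistics

def quantile(sorted_values, q):
    """Quantile with linear interpolation between the two nearest ranks."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

values = list(range(9702))  # assumption: the column is exactly 0, 1, ..., 9701

mean = statistics.fmean(values)
variance = statistics.variance(values)  # sample variance (n - 1 denominator)
std = statistics.stdev(values)
cv = std / mean                         # coefficient of variation
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)  # median absolute deviation
q1, q3 = quantile(values, 0.25), quantile(values, 0.75)
iqr = q3 - q1

print(mean, variance, round(std, 5), round(cv, 10))
print(q1, median, q3, iqr, mad)
```

Under these assumptions the results agree with the table: the coefficient of variation is the standard deviation divided by the mean, and the interquartile range is Q3 minus Q1.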
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 6462 | 1 | < 0.1% |
| 6464 | 1 | < 0.1% |
| 6465 | 1 | < 0.1% |
| 6466 | 1 | < 0.1% |
| 6467 | 1 | < 0.1% |
| 6468 | 1 | < 0.1% |
| 6469 | 1 | < 0.1% |
| 6470 | 1 | < 0.1% |
| 6471 | 1 | < 0.1% |
| Other values (9692) | 9692 |
| Value | Count | Frequency (%) |
| 0 | 1 | |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 |
| Value | Count | Frequency (%) |
| 9701 | 1 | |
| 9700 | 1 | |
| 9699 | 1 | |
| 9698 | 1 | |
| 9697 | 1 | |
| 9696 | 1 | |
| 9695 | 1 | |
| 9694 | 1 | |
| 9693 | 1 | |
| 9692 | 1 |
ID
Real number (ℝ≥0)
| Distinct | 9702 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5076115.117 |
| Minimum | 5008804 |
|---|---|
| Maximum | 5150479 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 75.9 KiB |
Quantile statistics
| Minimum | 5008804 |
|---|---|
| 5-th percentile | 5021597.15 |
| Q1 | 5036955.75 |
| median | 5069452.5 |
| Q3 | 5112987.75 |
| 95-th percentile | 5143323.95 |
| Maximum | 5150479 |
| Range | 141675 |
| Interquartile range (IQR) | 76032 |
Descriptive statistics
| Standard deviation | 40807.0046 |
|---|---|
| Coefficient of variation (CV) | 0.00803902269 |
| Kurtosis | -1.208459014 |
| Mean | 5076115.117 |
| Median Absolute Deviation (MAD) | 35512 |
| Skewness | 0.1265984442 |
| Sum | 4.924846886 × 10¹⁰ |
| Variance | 1665211625 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5008804 | 1 | < 0.1% |
| 5097105 | 1 | < 0.1% |
| 5097132 | 1 | < 0.1% |
| 5097136 | 1 | < 0.1% |
| 5097146 | 1 | < 0.1% |
| 5097148 | 1 | < 0.1% |
| 5097151 | 1 | < 0.1% |
| 5097154 | 1 | < 0.1% |
| 5097155 | 1 | < 0.1% |
| 5097157 | 1 | < 0.1% |
| Other values (9692) | 9692 |
| Value | Count | Frequency (%) |
| 5008804 | 1 | |
| 5008806 | 1 | |
| 5008808 | 1 | |
| 5008812 | 1 | |
| 5008815 | 1 | |
| 5008819 | 1 | |
| 5008825 | 1 | |
| 5008827 | 1 | |
| 5008830 | 1 | |
| 5008834 | 1 |
| Value | Count | Frequency (%) |
| 5150479 | 1 | |
| 5150467 | 1 | |
| 5150459 | 1 | |
| 5150451 | 1 | |
| 5150428 | 1 | |
| 5150410 | 1 | |
| 5150400 | 1 | |
| 5150388 | 1 | |
| 5150338 | 1 | |
| 5150337 | 1 |
CODE_GENDER
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 75.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 6318 | 65.1% |
| 1 | 3384 | 34.9% |
FLAG_OWN_CAR
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 75.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 6135 | 63.2% |
| 1 | 3567 | 36.8% |
FLAG_OWN_REALTY
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 75.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 6514 | 67.1% |
| 0 | 3188 | 32.9% |
AMT_INCOME_TOTAL
Real number (ℝ≥0)
| Distinct | 263 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 181219.8043 |
| Minimum | 27000 |
|---|---|
| Maximum | 1575000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 75.9 KiB |
Quantile statistics
| Minimum | 27000 |
|---|---|
| 5-th percentile | 67500 |
| Q1 | 112500 |
| median | 157500 |
| Q3 | 225000 |
| 95-th percentile | 360000 |
| Maximum | 1575000 |
| Range | 1548000 |
| Interquartile range (IQR) | 112500 |
Descriptive statistics
| Standard deviation | 99302.19023 |
|---|---|
| Coefficient of variation (CV) | 0.5479654425 |
| Kurtosis | 15.77629382 |
| Mean | 181219.8043 |
| Median Absolute Deviation (MAD) | 45000 |
| Skewness | 2.659466621 |
| Sum | 1758194541 |
| Variance | 9860924985 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 135000 | 1138 | 11.7% |
| 180000 | 843 | 8.7% |
| 112500 | 842 | 8.7% |
| 157500 | 829 | 8.5% |
| 225000 | 749 | 7.7% |
| 202500 | 558 | 5.8% |
| 90000 | 535 | 5.5% |
| 270000 | 425 | 4.4% |
| 67500 | 265 | 2.7% |
| 315000 | 232 | 2.4% |
| Other values (253) | 3286 |
| Value | Count | Frequency (%) |
| 27000 | 2 | |
| 29250 | 1 | < 0.1% |
| 30150 | 1 | < 0.1% |
| 31500 | 3 | |
| 31531.5 | 1 | < 0.1% |
| 31950 | 1 | < 0.1% |
| 32400 | 1 | < 0.1% |
| 33300 | 2 | |
| 33750 | 1 | < 0.1% |
| 36000 | 4 |
| Value | Count | Frequency (%) |
| 1575000 | 1 | < 0.1% |
| 1350000 | 1 | < 0.1% |
| 1125000 | 3 | < 0.1% |
| 990000 | 1 | < 0.1% |
| 945000 | 1 | < 0.1% |
| 900000 | 10 | |
| 810000 | 6 | |
| 787500 | 1 | < 0.1% |
| 765000 | 2 | < 0.1% |
| 742500 | 1 | < 0.1% |
NAME_INCOME_TYPE
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 75.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 4 |
|---|---|
| 2nd row | 4 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 4 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 4956 | 51.1% |
| 0 | 2312 | 23.8% |
| 1 | 1710 | 17.6% |
| 2 | 721 | 7.4% |
| 3 | 3 | < 0.1% |
NAME_EDUCATION_TYPE
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 75.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 4 |
| 3rd row | 4 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 4 | 6757 | 69.6% |
| 1 | 2454 | 25.3% |
| 2 | 371 | 3.8% |
| 3 | 114 | 1.2% |
| 0 | 6 | 0.1% |
NAME_FAMILY_STATUS
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 75.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 3 |
| 4th row | 2 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 6526 | 67.3% |
| 3 | 1358 | 14.0% |
| 0 | 836 | 8.6% |
| 2 | 572 | 5.9% |
| 4 | 410 | 4.2% |
NAME_HOUSING_TYPE
Real number (ℝ≥0)
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.274685632 |
| Minimum | 0 |
|---|---|
| Maximum | 5 |
| Zeros | 34 |
| Zeros (%) | 0.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 75.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 1 |
| 95-th percentile | 4 |
| Maximum | 5 |
| Range | 5 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.930142303 |
|---|---|
| Coefficient of variation (CV) | 0.7297032929 |
| Kurtosis | 10.10810157 |
| Mean | 1.274685632 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.374696287 |
| Sum | 12367 |
| Variance | 0.8651647038 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 8677 | |
| 5 | 448 | 4.6% |
| 2 | 323 | 3.3% |
| 4 | 144 | 1.5% |
| 3 | 76 | 0.8% |
| 0 | 34 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 34 | 0.4% |
| 1 | 8677 | |
| 2 | 323 | 3.3% |
| 3 | 76 | 0.8% |
| 4 | 144 | 1.5% |
| 5 | 448 | 4.6% |
| Value | Count | Frequency (%) |
| 5 | 448 | 4.6% |
| 4 | 144 | 1.5% |
| 3 | 76 | 0.8% |
| 2 | 323 | 3.3% |
| 1 | 8677 | |
| 0 | 34 | 0.4% |
OCCUPATION_TYPE
Real number (ℝ≥0)
| Distinct | 19 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.220882292 |
| Minimum | 0 |
|---|---|
| Maximum | 18 |
| Zeros | 300 |
| Zeros (%) | 3.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 75.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 6 |
| median | 10 |
| Q3 | 12 |
| 95-th percentile | 15 |
| Maximum | 18 |
| Range | 18 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 4.274346628 |
|---|---|
| Coefficient of variation (CV) | 0.4635507202 |
| Kurtosis | -0.6975556125 |
| Mean | 9.220882292 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.4240024851 |
| Sum | 89461 |
| Variance | 18.27003909 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 12 | 2992 | |
| 8 | 1724 | |
| 15 | 958 | 9.9% |
| 3 | 875 | 9.0% |
| 10 | 782 | 8.1% |
| 4 | 622 | 6.4% |
| 6 | 357 | 3.7% |
| 0 | 300 | 3.1% |
| 11 | 291 | 3.0% |
| 2 | 193 | 2.0% |
| Other values (9) | 608 | 6.3% |
| Value | Count | Frequency (%) |
| 0 | 300 | 3.1% |
| 1 | 146 | 1.5% |
| 2 | 193 | 2.0% |
| 3 | 875 | |
| 4 | 622 | 6.4% |
| 5 | 22 | 0.2% |
| 6 | 357 | 3.7% |
| 7 | 18 | 0.2% |
| 8 | 1724 | |
| 9 | 53 | 0.5% |
| Value | Count | Frequency (%) |
| 18 | 39 | 0.4% |
| 17 | 182 | 1.9% |
| 16 | 46 | 0.5% |
| 15 | 958 | 9.9% |
| 14 | 16 | 0.2% |
| 13 | 86 | 0.9% |
| 12 | 2992 | |
| 11 | 291 | 3.0% |
| 10 | 782 | 8.1% |
| 9 | 53 | 0.5% |
CNT_FAM_MEMBERS
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 8 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.179550608 |
| Minimum | 1 |
|---|---|
| Maximum | 9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 75.9 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 4 |
| Maximum | 9 |
| Range | 8 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.9062441788 |
|---|---|
| Coefficient of variation (CV) | 0.4157940519 |
| Kurtosis | 1.416363506 |
| Mean | 2.179550608 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 0.9572669926 |
| Sum | 21146 |
| Variance | 0.8212785116 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 5178 | |
| 1 | 1947 | 20.1% |
| 3 | 1635 | 16.9% |
| 4 | 802 | 8.3% |
| 5 | 117 | 1.2% |
| 6 | 18 | 0.2% |
| 7 | 4 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 1947 | 20.1% |
| 2 | 5178 | |
| 3 | 1635 | 16.9% |
| 4 | 802 | 8.3% |
| 5 | 117 | 1.2% |
| 6 | 18 | 0.2% |
| 7 | 4 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 9 | 1 | < 0.1% |
| 7 | 4 | < 0.1% |
| 6 | 18 | 0.2% |
| 5 | 117 | 1.2% |
| 4 | 802 | 8.3% |
| 3 | 1635 | 16.9% |
| 2 | 5178 | |
| 1 | 1947 | 20.1% |
AGE
Real number (ℝ≥0)
| Distinct | 7171 |
|---|---|
| Distinct (%) | 73.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 43.78130175 |
| Minimum | 20.50418558 |
|---|---|
| Maximum | 68.86383704 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 75.9 KiB |
Quantile statistics
| Minimum | 20.50418558 |
|---|---|
| 5-th percentile | 26.59383834 |
| Q1 | 34.05545631 |
| median | 42.73735943 |
| Q3 | 53.56646611 |
| 95-th percentile | 63.0207328 |
| Maximum | 68.86383704 |
| Range | 48.35965146 |
| Interquartile range (IQR) | 19.51100981 |
Descriptive statistics
| Standard deviation | 11.62574179 |
|---|---|
| Coefficient of variation (CV) | 0.2655412545 |
| Kurtosis | -1.053156755 |
| Mean | 43.78130175 |
| Median Absolute Deviation (MAD) | 9.548450687 |
| Skewness | 0.1505990557 |
| Sum | 424766.1896 |
| Variance | 135.1578722 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 28.54268055 | 6 | 0.1% |
| 38.66472275 | 5 | 0.1% |
| 57.0470304 | 5 | 0.1% |
| 48.29941751 | 5 | 0.1% |
| 58.35027413 | 5 | 0.1% |
| 52.10784616 | 5 | 0.1% |
| 56.64455807 | 5 | 0.1% |
| 40.86599999 | 5 | 0.1% |
| 58.48990739 | 5 | 0.1% |
| 55.04014456 | 5 | 0.1% |
| Other values (7161) | 9651 |
| Value | Count | Frequency (%) |
| 20.50418558 | 1 | |
| 21.09557349 | 1 | |
| 21.14485581 | 1 | |
| 21.23794465 | 1 | |
| 21.79100187 | 1 | |
| 21.84849792 | 1 | |
| 22.01551024 | 1 | |
| 22.05110303 | 1 | |
| 22.05657885 | 1 | |
| 22.08669583 | 1 |
| Value | Count | Frequency (%) |
| 68.86383704 | 1 | |
| 68.83098216 | 1 | |
| 68.71872797 | 1 | |
| 68.68861099 | 1 | |
| 68.47505424 | 1 | |
| 68.36553796 | 1 | |
| 68.34637262 | 1 | |
| 68.2998282 | 1 | |
| 68.2614975 | 1 | |
| 68.21221517 | 1 |
YEARS_EMPLOYED
Real number (ℝ≥0)
| Distinct | 3636 |
|---|---|
| Distinct (%) | 37.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.666517901 |
| Minimum | 0 |
|---|---|
| Maximum | 43.0207328 |
| Zeros | 1694 |
| Zeros (%) | 17.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 75.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.9288349521 |
| median | 3.761884228 |
| Q3 | 8.202769393 |
| 95-th percentile | 18.79778503 |
| Maximum | 43.0207328 |
| Range | 43.0207328 |
| Interquartile range (IQR) | 7.273934441 |
Descriptive statistics
| Standard deviation | 6.343724493 |
|---|---|
| Coefficient of variation (CV) | 1.119510183 |
| Kurtosis | 4.219012656 |
| Mean | 5.666517901 |
| Median Absolute Deviation (MAD) | 3.27042992 |
| Skewness | 1.8443603 |
| Sum | 54976.55667 |
| Variance | 40.24284044 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 1694 | 17.5% |
| 0.5475814014 | 17 | 0.2% |
| 1.09790071 | 17 | 0.2% |
| 2.798140961 | 14 | 0.1% |
| 0.3449762829 | 14 | 0.1% |
| 1.839873509 | 13 | 0.1% |
| 1.259437223 | 13 | 0.1% |
| 0.6790009377 | 13 | 0.1% |
| 0.6817388447 | 13 | 0.1% |
| 2.01236165 | 12 | 0.1% |
| Other values (3626) | 7882 |
| Value | Count | Frequency (%) |
| 0 | 1694 | |
| 0.04654441912 | 1 | < 0.1% |
| 0.1177300013 | 1 | < 0.1% |
| 0.1779639555 | 1 | < 0.1% |
| 0.1807018625 | 1 | < 0.1% |
| 0.1916534905 | 2 | < 0.1% |
| 0.1943913975 | 1 | < 0.1% |
| 0.1998672115 | 3 | < 0.1% |
| 0.2135567465 | 1 | < 0.1% |
| 0.2162946536 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 43.0207328 | 1 | |
| 42.87836164 | 1 | |
| 41.69011 | 1 | |
| 41.26573441 | 1 | |
| 41.17264557 | 1 | |
| 40.75922161 | 1 | |
| 40.54840277 | 1 | |
| 40.45257603 | 1 | |
| 39.79821625 | 1 | |
| 39.62572811 | 1 |
STATUS
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 75.9 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 8424 | 86.8% |
| 1 | 1278 | 13.2% |
MONTHS_BALANCE
Real number (ℝ≥0)
| Distinct | 61 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.26303855 |
| Minimum | 0 |
|---|---|
| Maximum | 60 |
| Zeros | 57 |
| Zeros (%) | 0.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 75.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 13 |
| median | 26 |
| Q3 | 41 |
| 95-th percentile | 56 |
| Maximum | 60 |
| Range | 60 |
| Interquartile range (IQR) | 28 |
Descriptive statistics
| Standard deviation | 16.64688326 |
|---|---|
| Coefficient of variation (CV) | 0.6106026382 |
| Kurtosis | -1.089553494 |
| Mean | 27.26303855 |
| Median Absolute Deviation (MAD) | 14 |
| Skewness | 0.2170390491 |
| Sum | 264506 |
| Variance | 277.1187224 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 11 | 219 | 2.3% |
| 13 | 216 | 2.2% |
| 7 | 215 | 2.2% |
| 16 | 212 | 2.2% |
| 15 | 210 | 2.2% |
| 5 | 210 | 2.2% |
| 18 | 206 | 2.1% |
| 39 | 201 | 2.1% |
| 6 | 197 | 2.0% |
| 3 | 196 | 2.0% |
| Other values (51) | 7620 |
| Value | Count | Frequency (%) |
| 0 | 57 | 0.6% |
| 1 | 136 | |
| 2 | 167 | |
| 3 | 196 | |
| 4 | 183 | |
| 5 | 210 | |
| 6 | 197 | |
| 7 | 215 | |
| 8 | 189 | |
| 9 | 185 |
| Value | Count | Frequency (%) |
| 60 | 100 | |
| 59 | 98 | |
| 58 | 111 | |
| 57 | 90 | |
| 56 | 114 | |
| 55 | 101 | |
| 54 | 102 | |
| 53 | 115 | |
| 52 | 104 | |
| 51 | 136 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, so it captures nonlinear monotonic relationships better than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
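As a minimal sketch of this definition (not the report generator's own implementation), ρ can be computed by ranking each variable, with ties sharing their average rank, and then applying the covariance-over-standard-deviations formula to the ranks. The function names and the toy data are illustrative assumptions.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def correlation(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Spearman's rho: the correlation of the rank variables."""
    return correlation(average_ranks(x), average_ranks(y))

# Toy data: y = x**2 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
print("rho on ranks:", spearman_rho(x, y))
print("r on raw values:", correlation(x, y))
```

On this toy data the rank-based coefficient reaches 1 while the coefficient on the raw values stays below 1, which is the sense in which ρ captures nonlinear monotonic relationships.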
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the angle to the x-axis (the magnitude of the slope) does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
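The formula and the invariance property can be illustrated with a short sketch. The function name and the toy data are assumptions made for illustration; this is not the code the report was generated with.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / math.sqrt(ss_x * ss_y)

# Toy data, purely illustrative.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
# Shifting and rescaling y (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r(x, [3 * b + 7 for b in y])

print(round(r, 6), round(r_rescaled, 6))
```

Rescaling by a negative factor would flip the sign of r but keep its magnitude.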
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations; τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
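The pair-counting definition can be sketched directly. The version below is the plain form described here, often called τ-a, in which tied pairs count as neither concordant nor discordant; library implementations commonly apply a tie correction instead, so their values may differ when ties are present. The function name and the toy data are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = pairs = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        pairs += 1
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / pairs

# Toy data: 7 concordant and 3 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [3, 1, 2, 5, 4]
print(kendall_tau_a(x, y))
```

The loop visits every pair once, so the cost grows quadratically with the number of observations; that is acceptable for a sketch but not how one would process large columns.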
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.
First rows
| | Unnamed: 0 | ID | CODE_GENDER | FLAG_OWN_CAR | FLAG_OWN_REALTY | AMT_INCOME_TOTAL | NAME_INCOME_TYPE | NAME_EDUCATION_TYPE | NAME_FAMILY_STATUS | NAME_HOUSING_TYPE | OCCUPATION_TYPE | CNT_FAM_MEMBERS | AGE | YEARS_EMPLOYED | STATUS | MONTHS_BALANCE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 5008804 | 1 | 1 | 1 | 427500.0 | 4 | 1 | 0 | 4 | 12 | 2 | 32.868574 | 12.435574 | 1 | 15 |
| 1 | 1 | 5008806 | 1 | 1 | 1 | 112500.0 | 4 | 4 | 1 | 1 | 17 | 2 | 58.793815 | 3.104787 | 0 | 29 |
| 2 | 2 | 5008808 | 0 | 0 | 1 | 270000.0 | 0 | 4 | 3 | 1 | 15 | 1 | 52.321403 | 8.353354 | 0 | 4 |
| 3 | 3 | 5008812 | 0 | 0 | 1 | 283500.0 | 1 | 1 | 2 | 1 | 12 | 1 | 61.504343 | 0.000000 | 0 | 20 |
| 4 | 4 | 5008815 | 1 | 1 | 1 | 270000.0 | 4 | 1 | 1 | 1 | 0 | 2 | 46.193967 | 2.105450 | 0 | 5 |
| 5 | 5 | 5008819 | 1 | 1 | 1 | 135000.0 | 0 | 4 | 1 | 1 | 8 | 2 | 48.674511 | 3.269061 | 0 | 17 |
| 6 | 6 | 5008825 | 0 | 1 | 0 | 130500.0 | 4 | 2 | 1 | 1 | 0 | 2 | 29.210730 | 3.019911 | 1 | 25 |
| 7 | 7 | 5008830 | 0 | 0 | 1 | 157500.0 | 4 | 4 | 1 | 1 | 8 | 2 | 27.463945 | 4.021985 | 1 | 31 |
| 8 | 8 | 5008834 | 0 | 0 | 1 | 112500.0 | 4 | 4 | 3 | 1 | 12 | 2 | 30.029364 | 4.435409 | 0 | 44 |
| 9 | 9 | 5008836 | 1 | 1 | 1 | 270000.0 | 4 | 4 | 1 | 1 | 8 | 5 | 34.741302 | 3.184186 | 0 | 24 |
Last rows
| | Unnamed: 0 | ID | CODE_GENDER | FLAG_OWN_CAR | FLAG_OWN_REALTY | AMT_INCOME_TOTAL | NAME_INCOME_TYPE | NAME_EDUCATION_TYPE | NAME_FAMILY_STATUS | NAME_HOUSING_TYPE | OCCUPATION_TYPE | CNT_FAM_MEMBERS | AGE | YEARS_EMPLOYED | STATUS | MONTHS_BALANCE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 9692 | 9692 | 5142973 | 1 | 0 | 0 | 180000.0 | 4 | 4 | 1 | 1 | 8 | 1 | 29.175137 | 2.535302 | 1 | 18 |
| 9693 | 9693 | 5143578 | 1 | 1 | 0 | 157500.0 | 4 | 2 | 3 | 5 | 4 | 2 | 24.980664 | 2.628391 | 1 | 14 |
| 9694 | 9694 | 5145690 | 0 | 0 | 1 | 306000.0 | 1 | 1 | 1 | 1 | 12 | 2 | 59.111412 | 0.000000 | 1 | 17 |
| 9695 | 9695 | 5145760 | 0 | 1 | 0 | 135000.0 | 4 | 1 | 1 | 1 | 12 | 2 | 42.349946 | 13.235042 | 1 | 10 |
| 9696 | 9696 | 5146078 | 0 | 0 | 1 | 108000.0 | 4 | 4 | 3 | 1 | 15 | 1 | 34.834391 | 3.099311 | 1 | 48 |
| 9697 | 9697 | 5148694 | 0 | 0 | 0 | 180000.0 | 1 | 4 | 0 | 2 | 8 | 2 | 56.400884 | 0.542106 | 1 | 20 |
| 9698 | 9698 | 5149055 | 0 | 0 | 1 | 112500.0 | 0 | 4 | 1 | 1 | 12 | 2 | 43.360233 | 7.375921 | 1 | 19 |
| 9699 | 9699 | 5149729 | 1 | 1 | 1 | 90000.0 | 4 | 4 | 1 | 1 | 12 | 2 | 52.296762 | 4.711938 | 1 | 21 |
| 9700 | 9700 | 5149838 | 0 | 0 | 1 | 157500.0 | 1 | 1 | 1 | 1 | 11 | 2 | 33.914454 | 3.627727 | 1 | 32 |
| 9701 | 9701 | 5150337 | 1 | 0 | 1 | 112500.0 | 4 | 4 | 3 | 4 | 8 | 1 | 25.155890 | 3.266323 | 1 | 13 |